Character-level Convolutional Networks for Text Classification
This article offers an empirical exploration on the use of character-level
convolutional networks (ConvNets) for text classification. We constructed
several large-scale datasets to show that character-level convolutional
networks could achieve state-of-the-art or competitive results. Comparisons are
offered against traditional models such as bag of words, n-grams and their
TFIDF variants, and deep learning models such as word-based ConvNets and
recurrent neural networks.
Comment: An early version of this work, entitled "Text Understanding from
Scratch", was posted in Feb 2015 as arXiv:1502.01710. The present paper has
considerably more experimental results and a rewritten introduction. Advances
in Neural Information Processing Systems 28 (NIPS 2015).
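The input encoding behind character-level ConvNets can be illustrated with a short sketch. This is not the paper's exact pipeline; the abbreviated alphabet and the 1014-character length are illustrative assumptions (the paper uses a fixed alphabet and a fixed input width, with out-of-alphabet characters mapped to zero vectors).

```python
import numpy as np

# Abbreviated alphabet for illustration; the paper uses a larger fixed set.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,;:!?'-"
CHAR_TO_IDX = {c: i for i, c in enumerate(ALPHABET)}

def quantize(text, max_len=1014):
    """One-hot encode a string at the character level.

    Returns an array of shape (len(ALPHABET), max_len). Characters outside
    the alphabet, and padding positions past the end of the text, are
    all-zero columns. This matrix is what a 1-D ConvNet consumes.
    """
    x = np.zeros((len(ALPHABET), max_len), dtype=np.float32)
    for pos, ch in enumerate(text.lower()[:max_len]):
        idx = CHAR_TO_IDX.get(ch)
        if idx is not None:
            x[idx, pos] = 1.0
    return x
```

A 1-D convolution then slides over the position axis, so the model learns directly from raw characters with no tokenizer or word embeddings.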
Optimal dual martingales, their analysis and application to new algorithms for Bermudan products
In this paper we introduce and study the concept of optimal and surely
optimal dual martingales in the context of dual valuation of Bermudan options,
and outline the development of new algorithms in this context. We provide a
characterization theorem, a theorem which gives conditions for a martingale to
be surely optimal, and a stability theorem concerning martingales that are,
in a suitable sense, close to surely optimal. Guided by these results we develop a
framework of backward algorithms for constructing such a martingale. In turn
this martingale may then be utilized for computing an upper bound of the
Bermudan product. The methodology is purely dual in the sense that it does not
require input approximations to the Snell envelope. In an It\^o-L\'evy
environment we outline a particular regression based backward algorithm which
allows for computing dual upper bounds without nested Monte Carlo simulation.
Moreover, as a by-product this algorithm also provides approximations to the
continuation values of the product, which in turn determine a stopping policy.
Hence, we may obtain lower bounds at the same time. In a first numerical study
we demonstrate the backward dual regression algorithm in a Wiener environment
at well known benchmark examples. It turns out that the method is at least
comparable to the one in Belomestny et al. (2009) regarding accuracy, while
offering several advantages in computational robustness.
Comment: This paper is an extended version of Schoenmakers and Huang, "Optimal
dual martingales and their stability; fast evaluation of Bermudan products
via dual backward regression", WIAS Preprint 157.
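The dual valuation principle the abstract builds on can be sketched in a few lines: any martingale M with M_0 = 0 yields an upper bound V_0 <= E[max_t (Z_t - M_t)] on the Bermudan price, where Z_t is the discounted payoff. The sketch below uses the trivial martingale M = 0 on simulated geometric Brownian paths, which gives a valid but loose bound; the paper's contribution is a backward regression algorithm that constructs a near-optimal M without nested simulation. All parameters here are illustrative assumptions.

```python
import numpy as np

def dual_upper_bound(paths, payoff, martingale=None):
    """Dual (Rogers-type) upper bound: E[ max_t (Z_t - M_t) ].

    paths:      array (n_paths, n_steps+1) of the underlying asset
    payoff:     maps the path array to discounted exercise values Z
    martingale: array of the same shape as Z with M_0 = 0; defaults to
                the trivial martingale M = 0 (a loose bound)
    """
    Z = payoff(paths)
    M = np.zeros_like(Z) if martingale is None else martingale
    return np.mean(np.max(Z - M, axis=1))

# Illustrative Bermudan put under geometric Brownian motion.
rng = np.random.default_rng(0)
n, steps, dt = 20000, 10, 0.1
S0, r, sigma, K = 100.0, 0.05, 0.2, 100.0
dW = rng.standard_normal((n, steps)) * np.sqrt(dt)
logS = np.cumsum((r - 0.5 * sigma**2) * dt + sigma * dW, axis=1)
S = S0 * np.exp(np.hstack([np.zeros((n, 1)), logS]))
disc = np.exp(-r * dt * np.arange(steps + 1))  # discount factors per date
ub = dual_upper_bound(S, lambda s: disc * np.maximum(K - s, 0.0))
```

A tighter bound comes from plugging in a martingale close to the Doob martingale of the Snell envelope, which is exactly what the backward regression algorithm approximates.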
Attention-Based End-to-End Speech Recognition on Voice Search
Recently, there has been a growing interest in end-to-end speech recognition
that directly transcribes speech to text without any predefined alignments. In
this paper, we explore the use of an attention-based encoder-decoder model for
Mandarin speech recognition on a voice search task. Previous attempts have
shown that applying attention-based encoder-decoder to Mandarin speech
recognition was quite difficult due to the logographic orthography of Mandarin,
the large vocabulary and the conditional dependency of the attention model. In
this paper, we use character embedding to deal with the large vocabulary.
Several tricks are used for effective model training, including L2
regularization, Gaussian weight noise and frame skipping. We compare two
attention mechanisms and use attention smoothing to cover long context in the
attention model. Taken together, these tricks allow us to finally achieve a
character error rate (CER) of 3.58% and a sentence error rate (SER) of 7.43% on
the MiTV voice search dataset. When combined with a trigram language model,
the CER and SER drop to 2.81% and 5.77%, respectively.
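The attention smoothing mentioned above can be illustrated with a small sketch. A common form (proposed by Chorowski et al. for speech attention) replaces the exponential in the softmax with a logistic sigmoid before normalizing, which flattens the weight distribution so the decoder attends over a longer acoustic context. Whether this is the exact variant used here is an assumption; the sketch shows the general idea.

```python
import numpy as np

def softmax_attention(scores):
    """Standard softmax attention weights over a score vector."""
    e = np.exp(scores - scores.max())  # shift for numerical stability
    return e / e.sum()

def smoothed_attention(scores):
    """Smoothed attention: sigmoid instead of exp, then normalize.

    The sigmoid saturates for large scores, so no single frame can
    dominate the way it does under softmax; the resulting distribution
    is flatter and covers more of the encoder output.
    """
    s = 1.0 / (1.0 + np.exp(-scores))
    return s / s.sum()
```

For a sharply peaked score vector, the smoothed weights spread probability mass over more encoder frames than the softmax weights do.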